This study compares the stock price forecasting accuracy of two well-established models: the
traditional autoregressive integrated moving average (ARIMA) model and a deep learning
network, the long short-term memory (LSTM) model. Both perform well in time series
analysis and are applied here to ten exchange-traded funds (ETFs) covering different
market sectors. The parameters of both models were optimised, and this process revealed
several differences from the existing literature regarding the optimal combination of
parameters. ARIMA was more accurate when making point predictions but was significantly
outperformed by LSTMs in long-term predictions. Point predictions made by ARIMA were
found to have accuracies similar to those of the long-run predictions made by LSTMs.

1. INTRODUCTION

In today’s financial climate, stock prices have become more volatile and less predictable. Forecasting
them is increasingly challenging because forecasts rely on historical price data and patterns which might
not reflect recent trends. Time series forecasting methods fall into two broad categories: traditional
methods and deep learning networks. Traditional techniques include the autoregressive (AR) model, the
autoregressive moving average (ARMA) model, simple exponential smoothing (SES) and, most notably, the
ARIMA model [1]. The ARIMA model is designed to predict future data points in a non-stationary time series.

With advances in computational power and the proliferation of new machine learning techniques,
many deep learning methods have been developed, such as artificial neural networks (ANN), multilayer
perceptrons (MLP), recurrent neural networks (RNN) and long short-term memory networks (LSTMs). LSTMs are
capable of learning long-term dependencies and remembering information for long periods of time, and are thus
among the best models for analysing and predicting stock price time series data [2]. Whether traditional
forecasting models or deep learning networks produce more accurate forecasts is therefore a vital question. This
study describes the structures and operations within two models, ARIMA and LSTM. In determining the
optimal combination of parameters for the ARIMA, combinations commonly used in past research were
tested. Although those combinations performed well in the existing literature, where they were typically
selected using tests such as the partial autocorrelation function (PACF) test or the Augmented Dickey-Fuller
(ADF) test, numerous other combinations of parameters were found to have similarly high accuracies. This work
builds upon those studies to point out that the predictive capability of a set of parameters cannot be judged
solely on the PACF or ADF tests, and that performance varies with the stock ticker as well as the time period.
Optimal combinations of parameters for both the ARIMA and LSTM were then used to build the models. The
forecasting accuracies of the ARIMA and LSTM models were then compared by calculating their error rates.


2. LITERATURE REVIEW 
2.1. ARIMA model 
ARIMA is a generalised ARMA model, which was introduced by Box, Jenkins, and Reinsel in 1970.
It combines both autoregressive and moving average processes. ARIMA (p, d, q) comprises three parts:

— Autoregressive (AR): Observations from previous time steps are input to a regression equation to predict 
the value at the next time step. This is determined by the parameter ‘p’, representing the order or number 
of time lags of the autoregressive model. 

— Integrated (I): This process differences the data to make the series stationary. This is determined by the
parameter ‘d’, representing the degree of differencing. In most financial time series, a single round of
differencing is enough to make the series stationary and for the ARIMA model to be applied [3].

— Moving average (MA): The model takes into consideration the relationship between an observation and 
a residual error from a moving average model applied to past observations. This is determined by the 
parameter ‘q’, representing the order of the moving average. 

After differencing the series, the ARMA model for a time series $X_t$, where $t$ represents the time
index, can be represented by the following equation,

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} + \varepsilon_t$$

where $\alpha$ and $\theta$ are estimated coefficients and $\varepsilon$ are white noise errors.
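
To make the equation concrete, the following minimal sketch differences a short price series once and
evaluates the ARMA expression for the next step, with the unknown future error set to its mean of zero.
The function names, the coefficients and the residuals are illustrative assumptions, not values estimated
in this study.

```python
def difference(series):
    """First-order differencing: y_t = x_t - x_(t-1)."""
    return [b - a for a, b in zip(series, series[1:])]


def arma_one_step(values, errors, alpha, theta):
    """Evaluate the ARMA equation for the next time step.

    values: past observations, most recent last (at least len(alpha) items)
    errors: past residuals, most recent last (at least len(theta) items)
    alpha:  AR coefficients alpha_1..alpha_p
    theta:  MA coefficients theta_1..theta_q
    The future error term is replaced by its expected value, zero.
    """
    ar_part = sum(a * values[-i] for i, a in enumerate(alpha, start=1))
    ma_part = sum(th * errors[-i] for i, th in enumerate(theta, start=1))
    return ar_part - ma_part


prices = [100.0, 102.0, 101.0, 105.0]   # made-up closing prices
diffs = difference(prices)              # [2.0, -1.0, 4.0]

# Hypothetical ARMA(1, 1) coefficients and residuals, for illustration only
next_diff = arma_one_step(diffs, errors=[0.5, -0.2, 1.0], alpha=[0.5], theta=[0.3])

# Undo the differencing to obtain a price forecast (about 106.7 here)
next_price = prices[-1] + next_diff
```

The last line reverses the single differencing step (d = 1) by adding the predicted change to the last
observed price.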

Following that, the autoregressive process predicts the variable using a linear combination of past 
values. The moving average process then gives a prediction of the variable from a moving average model on 
past prediction errors [3]. Several studies have analysed the accuracy of ARIMA models in stock price 
forecasting. 

Results from analysing stock data from New York Stock Exchange (NYSE) and Nigeria Stock 
Exchange (NSE) using ARIMA revealed that ARIMA has a strong potential for short-term prediction and can 
compete favourably with existing techniques [4]. Similarly, the accuracy in predicting stock prices on 56 Indian 
stocks was above 85% for all market sectors [5]. ARIMA has been established to be relatively more robust and 
efficient than complex structural models for short-run forecasting. However, its performance in forecasting 
essentially relies on past values as well as previous error terms and does not assume knowledge of any 
underlying relationships unlike deep learning models [6]. 


2.2. LSTM model 

LSTMs are a variation of RNNs and have gained much recognition in time series forecasting as they
overcome the vanishing gradient problem in RNNs and are able to remember information for a longer time. In
a typical LSTM, a cell state runs through the entire network and each LSTM layer comprises memory cells
which consist of gates that add or remove information. Figure 1 depicts an individual LSTM cell with
functions that have been numbered corresponding to the equations. The three gates are:
— The Forget Gate: determines information from previous cells to be remembered; 
— The Input Gate: determines input information to be retained; and 
— The Output Gate: determines the information leaving the memory cell to both the next memory cell and
the next neural network layer.
The following are the equations within a memory cell of the LSTM, 


$$
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && (1) \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && (2) \\
B_t &= \tanh(W_B \cdot [h_{t-1}, x_t] + b_B) && (3) \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && (4) \\
h_t &= o_t * \tanh(C_t) && (5) \\
C_t &= f_t * C_{t-1} + i_t * B_t && (6)
\end{aligned}
$$

where $x_t$ is the input vector, $h_t$ is the output vector, $C_t$ is the cell state vector, $f_t$ is the
forget gate vector, $i_t$ is the input gate vector, $B_t$ is a vector used to update the cell state
subsequently, $o_t$ is the output gate vector, and $W$ and $b$ are the weight matrices and bias vectors
respectively.
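
As an illustration of equations (1) to (6), the sketch below runs one forward step of a single memory cell
in NumPy. It assumes one weight matrix and one bias vector per gate, stored in dictionaries keyed by the
subscripts f, i, B and o. The function names, the layer sizes and the random weights are illustrative
choices and not the configuration trained in this study.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One forward step of an LSTM memory cell.

    W[k] has shape (hidden, hidden + inputs) and b[k] has shape (hidden,)
    for each k in "f", "i", "B", "o".
    """
    z = np.concatenate([h_prev, x_t])      # [h_(t-1), x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])     # (1) forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])     # (2) input gate
    B_t = np.tanh(W["B"] @ z + b["B"])     # (3) candidate update
    o_t = sigmoid(W["o"] @ z + b["o"])     # (4) output gate
    c_t = f_t * c_prev + i_t * B_t         # (6) new cell state
    h_t = o_t * np.tanh(c_t)               # (5) new output
    return h_t, c_t


rng = np.random.default_rng(0)
hidden, inputs = 3, 2                      # small sizes, for illustration only
W = {k: rng.normal(scale=0.1, size=(hidden, hidden + inputs)) for k in "fiBo"}
b = {k: np.zeros(hidden) for k in "fiBo"}

# Feed a short random sequence through the cell, carrying h and c forward
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(5, inputs)):
    h, c = lstm_cell_step(x_t, h, c, W, b)
```

Equation (6) is evaluated before equation (5) in the code because the output depends on the updated cell
state.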


[Figure 1: diagram of an LSTM cell. Legend: neural network layer/gate, pointwise operation, vector
transfer, concatenate, copy.]


Figure 1. Structure of an LSTM cell


LSTMs are designed to remember information and forecast time series data, and thus much research
has been done to analyse their effectiveness in stock price forecasting. When used to predict returns of the
Chinese stock market, results confirmed significant forecasting accuracy and the promising predictive power
of LSTM [7]. LSTMs trained with various sizes of input data were still found to predict share prices with a
very low loss and error rate [8]. Another study emphasised the high accuracy of 94.8% and the
efficiency of the LSTM when tested on the stock price of a private sector bank in India [9].


2.3. Existing literature on comparisons between ARIMA and LSTM related models 

In recent times, more research has delved into the widely acclaimed LSTM, which has been found
to outperform many models, be it conventional methods or newer deep learning networks. As such,
comparisons between LSTM and ARIMA in stock price prediction have become a widespread topic of interest.
Despite the success of ARIMA models in time series analysis and forecasting, LSTMs and deep learning
models often provide greater accuracy [10], [11]. One study reported that the average error rate obtained by
LSTM was between 84% and 87% lower than that of ARIMA, indicating the superiority of LSTM [1]. When applied
to 284 stocks from the S&P 500 stock market index, the results confirmed a significant reduction in prediction
errors when LSTM is used as compared to ARIMA [3]. Further research found that LSTM is able to learn
non-linear relationships from data, thus resulting in lower error than ARIMA [1], [12], [13].

Although the forecast accuracies in long term prediction of both models decrease, LSTM outperforms 
ARIMA significantly. Despite making rather accurate predictions at the beginning, the error of forecasts 
increased for ARIMA as time passed, while LSTM performed better in the long term [14]. Another study also 
found that despite ARIMA having a better accuracy in short-term prediction, LSTM is better than ARIMA in 
prediction accuracy and stability for the closing price of the SSE 50 Index in the long run [15]. 

Upon analysing the principles and prediction results of both models, one study found that LSTM had a
better predictive ability but was greatly affected by the data processing [16]. However, another study found
that when the amount of training data was increased from 1 year to 3 or 5 years, neither increase yielded an
improved result. Even so, LSTM forecasted with 94% peak accuracy, while ARIMA reached 56%, and LSTM
consistently outperformed ARIMA [17]. For another form of time series data, results showed that LSTM can
reduce training error by as much as 95% compared to ARIMA when used for spot price prediction [18]. In
predicting Bitcoin prices, LSTM also gives significantly better predictions than ARIMA [19], [20].


3. METHODOLOGY 

Given the large amount of data processed for stock price forecasting, computational approaches
are used to build both traditional models and deep learning networks. Python packages were used to perform
various mathematical and data-handling functions. The Yahoo Finance package (yfinance) was used in this
study to retrieve the stock price data. The Pandas package was used to convert financial time series data
into suitable data structures for analysis and visualisation, while NumPy was utilised to perform numerical
and array calculations. Matplotlib was used for the 2D and 3D plotting of data for further analysis. To
organise and split the data into training and test sets, the scikit-learn package was employed.

The ARIMA was built using the statsmodels package, while the Keras package was used to build
LSTMs comprising LSTM and dense layers. The LSTM in this study was trained with the mean squared error
as the loss function and the ‘adam’ optimiser. Their various architectures will be discussed. Comparisons
of forecasting accuracy were based on the mean squared error (MSE) of each model.
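
The evaluation metric and the train/test split can be sketched in a few lines of plain Python. This is an
illustration under stated assumptions rather than the code used in this study: the split below keeps the
observations in time order (the study used scikit-learn for splitting, and the exact call is not given),
and the function names and the five closing prices are made up.

```python
def chronological_split(series, n_train):
    """Split a time-ordered series without shuffling: first n_train points train, the rest test."""
    return series[:n_train], series[n_train:]


def mean_squared_error(actual, predicted):
    """MSE = average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


closes = [10.0, 11.0, 12.0, 13.0, 14.0]          # made-up closing prices
train, test = chronological_split(closes, n_train=3)

# Stand-in forecast: repeat the last training value for every test day
forecast = [train[-1]] * len(test)
mse = mean_squared_error(test, forecast)         # ((13-12)^2 + (14-12)^2) / 2 = 2.5
```

Because MSE is expressed in squared price units, values for funds with different price levels are not
directly comparable with one another.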


3.1. Development of ARIMA model 

As the performance of ARIMA depends greatly on the p, d, q parameters, various combinations of
these parameters, including those used in past literature, were tested. The funds used for this were the
S&P 500 fund (SPY), Financial Select Sector SPDR Fund (XLF), Technology Select Sector SPDR Fund (XLK),
Industrial Select Sector SPDR Fund (XLI), Materials Select Sector SPDR Fund (XLB), Energy Select Sector
SPDR Fund (XLE), Consumer Staples Select Sector SPDR Fund (XLP), Health Care Select Sector SPDR Fund
(XLV), Utilities Select Sector SPDR Fund (XLU) and Consumer Discretionary Select Sector SPDR Fund
(XLY). The combinations used in past studies have been referenced accordingly within Tables 1 and 2. The
dataset used spanned from 1 January 2014 to 31 December 2021 and the first 1,915 data points were used to
train the models. The last 100 data points were used as the test set for point predictions to be made.


Table 1. MSE for different p, d, q combinations for ARIMA on SPY, XLF, XLK, XLI and XLB funds

p, d, q             SPY      XLF       XLK      XLI       XLB
0, 1, 0 [1]         14.262   0.19185   3.7461   0.94633   0.66610
1, 1, 0             14.626   0.19082   3.7778   0.95990   0.67884
2, 1, 0 [21]        14.708   0.19186   3.7994   0.96037   0.67823
3, 1, 0             14.787   0.19257   3.8161   0.96057   0.67877
4, 1, 0             14.490   0.18846   3.7354   0.94479   0.67793
5, 1, 0 [1], [12]   14.588   0.19130   3.7443   0.96296   0.68739
0, 1, 2 [22]        14.734   0.19234   3.8117   0.95852   0.67777
1, 0, 0 [4], [16]   14.388   0.19222   3.7685   0.94808   0.67131
1, 0, 2 [5]         -        0.19271   -        0.96011   -
1, 1, 1 [15]        14.664   0.19073   3.7932   0.96154   0.67431
1, 1, 2 [13]        15.339   0.19208   3.7941   0.99764   0.69479
1, 2, 1 [14]        14.606   0.19127   3.7603   0.96298   0.67877


Table 2. MSE for different p, d, q combinations for ARIMA on XLE, XLP, XLV, XLU and XLY funds

p, d, q             XLE       XLP       XLV      XLU       XLY
0, 1, 0 [1]         0.85086   0.27609   1.0423   0.36149   5.0210
1, 1, 0             0.85115   0.28897   1.0776   0.36217   5.0615
2, 1, 0 [21]        0.85572   0.28733   1.0777   0.36786   4.9916
3, 1, 0             0.85314   0.28585   1.0781   0.37011   5.0031
4, 1, 0             0.85629   0.28458   1.1067   0.38237   4.9132
5, 1, 0 [1], [12]   0.85989   0.28744   1.0982   0.38525   4.9243
0, 1, 2 [22]        0.85607   0.28688   1.0824   0.36779   4.9718
1, 0, 0 [4], [16]   0.84455   0.28005   1.0499   0.36597   5.0486
1, 0, 2 [5]         0.85003   0.29047   -        0.37251   -
1, 1, 1 [15]        -         0.28777   1.0770   0.36515   5.0468
1, 1, 2 [13]        -         0.28958   1.1140   0.36451   4.9888
1, 2, 1 [14]        0.85052   0.28855   1.0775   0.36268   5.0465


As stock price data is often non-stationary in nature, some combinations of parameters were unable
to induce stationarity and lead to the convergence of forecasts, thus returning an error (indicated by a dash
in Tables 1 and 2). While studies have conducted various experiments, such as the ADF and PACF tests,
and each found an optimal combination of p, d, q parameters, the results in Tables 1 and 2 illustrate how
different combinations have very similar accuracies. How the combinations fare against one another also varies
depending on the ticker, as a combination that is optimal for one ticker might not have the greatest predictive
capability for another ticker.

When comparing their performances across these 10 stock tickers, it was found that a single finite
difference without AR or MA modelling, which is the ARIMA (0, 1, 0), performs slightly better than other p,
d, q combinations for the datasets in this work, similar to a past study [1]. This implies that the time series
data can be modelled as the integral (cumulative sum) of a white noise process, i.e. a random walk, the
discrete-time analogue of a Wiener process. This conclusion is consistent with the literature dealing with
financial data series analysis [23]. Thus, this combination of parameters will be used in the ARIMA models
for this study.
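
With p = q = 0 and d = 1 the model reduces to X_t = X_(t-1) + e_t, so the forecast for any horizon is simply
the last observed value, plus a constant step per period if a drift term is included. The sketch below shows
this behaviour. The function name and the optional `drift` argument are illustrative assumptions; whether a
drift term was fitted in this study is not stated.

```python
def random_walk_forecast(history, horizon, drift=0.0):
    """Multi-step forecast from an ARIMA (0, 1, 0) model.

    Without drift every future value equals the last observation;
    with drift the forecast path is a straight line starting from it.
    """
    last = history[-1]
    return [last + drift * step for step in range(1, horizon + 1)]


history = [100.0, 101.5, 101.0]   # made-up closing prices

flat_path = random_walk_forecast(history, horizon=3)
# [101.0, 101.0, 101.0]

linear_path = random_walk_forecast(history, horizon=3, drift=0.5)
# [101.5, 102.0, 102.5]
```

Either way the forecast path contains no curvature, which is why such a model cannot follow turning points
unless it is given new observations.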


3.2. Development of LSTM model 

A prior study [24] investigated the optimisation of parameters in LSTMs with
respect to the SPY fund when making long run predictions. The dataset used spanned from 1 January 2012 to
31 December 2021 and the first 1,200 data points were used to train the models. It was found that the
architecture that produced the most accurate forecasts was 1 LSTM layer and 2 dense layers. The optimal
number of time steps was 30 and the optimal number of units was 130. 4 features were found to be the most
favourable, where the past 4 days of closing prices were input into a single LSTM cell. Thus, the LSTM used
in this study comprised 1 LSTM layer, 2 dense layers, 30 time steps, 130 units and 4 features.
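
One way to arrange closing prices into the 30 time steps and 4 features described above is sketched below.
It rests on an assumed reading of the text: each time step carries the 4 most recent closing prices, and the
target is the close on the day after the window ends. The function name and the synthetic series are
illustrative, and the resulting (samples, time steps, features) layout is the input shape that a Keras LSTM
layer expects.

```python
import numpy as np


def make_windows(closes, time_steps=30, n_features=4):
    """Build LSTM inputs of shape (samples, time_steps, n_features) and next-day targets."""
    closes = np.asarray(closes, dtype=float)
    # Row r holds the n_features consecutive closes ending on day r + n_features - 1
    lagged = np.lib.stride_tricks.sliding_window_view(closes, n_features)
    n_samples = len(lagged) - time_steps
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    # Each sample stacks time_steps consecutive rows of lagged closes
    X = np.stack([lagged[s:s + time_steps] for s in range(n_samples)])
    # Target: the close on the day after the last day seen in each sample
    y = closes[n_features - 1 + time_steps:]
    return X, y


closes = np.arange(40.0)          # synthetic series 0, 1, ..., 39
X, y = make_windows(closes)
print(X.shape, y.shape)           # (7, 30, 4) (7,)
```

In practice the prices would usually be scaled before training, and the windows would be built separately
for the training and test portions so that no test-period price appears in a training input.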


4. RESULTS AND DISCUSSION 
4.1. Long run predictions 

This study investigated the performance of both models in making long-term price predictions. The
dataset used spanned from 1 January 2012 to 31 December 2021 and the first 1,200 data points were used to
train the models. Table 3 illustrates the superiority of LSTMs to ARIMA when making predictions in the long
run, given the significantly lower MSE that they produce. This is consistent with the results of several
studies where LSTMs were found to outperform ARIMA [1], [3].

Figure 2 illustrates the long run forecasts using ARIMA and LSTM. ARIMA is only able to make a
linear long-term prediction, as illustrated in Figure 2(a), if it is not retrained upon each daily forecast. This is
further corroborated by numerous studies where it was found that LSTMs outperform ARIMA as they are
able to learn non-linear relationships from data, as illustrated in Figure 2(b), unlike ARIMA, which makes
directional predictions and is better at forecasting linear time series [1], [21], [25]. As shown in Figure 2(a),
although forecasts made by ARIMA are rather similar to the actual closing prices initially, they are highly
inaccurate in the long run [14].


Table 3. MSE for long run predictions

STOCK TICKER   MSE FOR LSTM   MSE FOR ARIMA
SPY            30.35          2,145.76
XLF            0.28           15.09
XLK            4.82           1,458.26
XLI            4.20           59.29
XLB            0.51           60.89
XLE            1.40           2,884.28
XLP            0.81           26.92
XLV            6.36           62.22
XLU            0.41           14.10
XLY            4.24           717.25

[Figure 2: two panels plotting forecasts against closing prices for the SPY fund. (a) ARIMA Long Run
Prediction for SPY Fund; (b) LSTM Long Run Prediction for SPY Fund.]


Figure 2. Long run prediction for SPY fund for (a) ARIMA and (b) LSTM


4.2. Point predictions 

The performance of both models in making point predictions was also studied. After the closing price
for a given day was forecast, the actual closing price for that day was added to the training set and the model
was refitted before the next forecast. The dataset used spanned from 1 January 2014 to 31 December
2021 and the first 1,990 data points were used to train the models. The last 25 data points were used as the test
set for point predictions to be made.
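
The retraining procedure described above is commonly called walk-forward evaluation, and its control flow
can be sketched as follows. The function names are hypothetical, and the stand-in forecaster simply repeats
the last close; in the study this step would be a refitted ARIMA or LSTM model.

```python
def walk_forward(train, test, forecaster):
    """Forecast one day at a time, revealing each actual close before the next forecast."""
    history = list(train)
    predictions = []
    for actual in test:
        # Fit on everything seen so far and predict the next close
        predictions.append(forecaster(history))
        # The true close then becomes part of the training data
        history.append(actual)
    return predictions


def last_value(history):
    """Stand-in forecaster: tomorrow's close equals today's."""
    return history[-1]


train = [10.0, 11.0, 12.0]        # made-up closing prices
test = [13.0, 12.5, 14.0]

preds = walk_forward(train, test, last_value)    # [12.0, 13.0, 12.5]
mse = sum((a - p) ** 2 for a, p in zip(test, preds)) / len(test)
```

Each prediction is made before the corresponding actual value is appended, so no forecast uses information
from the day it is predicting.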

Table 4 illustrates how the ARIMA outperforms LSTMs in making point predictions, highlighting the
accuracy of the short-term forecasts ARIMA produces [4]. ARIMA performed better when it was retrained upon
each iteration and forecast only the following day’s closing price than it did in long-term prediction.
However, the opposite held for LSTMs, where retraining the model repeatedly resulted in poorer performance.


Table 4. MSE for point predictions

STOCK TICKER   MSE FOR LSTM   MSE FOR ARIMA
SPY            106.21         30.76
XLF            1.00           0.33
XLK            36.89          8.43
XLI            4.12           1.92
XLB            3.77           1.03
XLE            1.49           0.92
XLP            1.51           0.57
XLV            8.12           1.25
XLU            1.99           0.53
XLY            33.30          10.03


4.3. Summary 

Despite ARIMA having a better accuracy in short-term prediction, LSTMs are more accurate and 
stable in the long run [15]. The point predictions made by ARIMA and the long-run predictions made by 
LSTMs have similar accuracies. However, LSTMs have greater potential in improving forecasting accuracy. 

First, the use of LSTM requires setting several parameters in its architecture to obtain optimal
performance, and the choice of parameters and model architecture can cause the performance to
vary significantly [1]. This is due to the higher complexity of deep learning models, which require adjustments
to their architecture, such as the number of neurons in the input layer and hidden layer, and the tuning of other
hyperparameters [13]. Although the parameters chosen were found to be optimal in a prior study, it is
possible that this combination is not unique and that its performance might vary when the stock ticker or time
period changes.

Additionally, LSTMs are greatly affected by data processing. The dynamic nature of the stock
market cannot be analysed using historical data alone; current conditions must also be considered, including
trending news in politics and economics that affects the behaviour of investors and, consequently, stock
markets [16]. Other studies claim that LSTMs, unlike ARIMA, require designed features as patterns cannot be
automatically detected within data. These features include technical indicators, such as trading volume,
momentum and volatility. They help LSTMs distinguish between temporary price movements and long-term
trends, reducing their vulnerability to false signals [16], [26], [27].


4.4. Future work 

Although both models show strong performance when forecasting stock prices, LSTM has
more potential for improvement and adjustment than ARIMA. This is a promising area for future
research, where LSTM models can be optimised by increasing the variety of input variables. Other forms of
data, such as technical indicators and sentiment analysis, could be incorporated into the LSTM model for
a more holistic approach instead of solely analysing price data. In this regard, a few studies have integrated
technical indicators in LSTMs [27]-[29] while others have examined the use of sentiment analysis with LSTMs
[30]-[32]. Hence, future research could attempt combining both technical indicators and sentiment analysis
in the model, together with other input variables such as the prices of related stocks, which may yield new
insights.


5. CONCLUSION 

This study has investigated the forecasting accuracies of ARIMA and LSTM when forecasting stock
prices for 10 stock tickers, comprising ETFs for various market sectors. LSTMs were found to
perform better in long-term predictions while ARIMA was superior in point predictions. The point predictions
made by ARIMA have accuracies similar to those of the long-run predictions made by LSTMs. All in all, both
the ARIMA and LSTM are well-established models for stock price prediction and show strong forecasting
accuracy. While the ARIMA model and LSTM model had similar accuracies in our study, the LSTM model
has a larger capacity for improvement and is a promising area for further research. This study has shed light
on the efficacy of ARIMA and LSTM models and hopes to spark further investigations and curiosity in the
field of time series analysis and stock price forecasting.